Small Language Models (SLMs)
SLMs are AI models capable of processing, understanding and generating natural language content. As their name implies, SLMs are smaller in scale and scope than LLMs, typically on the order of a few hundred million to a few billion parameters.
Resources
- https://github.com/slashml/awesome-small-language-models
- What are Small Language Models (SLM)? | IBM
- Model compression techniques are applied to build a leaner model from a larger one. Compressing a model entails reducing its size while retaining as much of its accuracy as possible. Common model compression methods (a toy PyTorch sketch follows at the end of this resource list):
- Pruning: Removes less crucial, redundant or unnecessary parameters from a neural network
- Quantization: Converts high-precision data to lower-precision data
- Low-rank factorization: Approximates a large weight matrix with the product of two smaller, lower-rank matrices. This more compact approximation has fewer parameters, reduces the number of computations and simplifies complex matrix operations
- Knowledge distillation: Transfers the knowledge of a pretrained “teacher model” to a smaller “student model.” The student is trained not only to match the teacher’s predictions but also to mimic its underlying reasoning process
- Small Language Models (SLMs) - 2024 overview | SuperAnnotate
- A Deep Dive into Small Language Models: Efficient Alternatives to Large Language Models for Real-Time Processing and Specialized Tasks - MarkTechPost
- Training-Small-Language-Model/Training_a_Small_Language_Model.ipynb at main · AIAnytime/Training-Small-Language-Model · GitHub
- HuggingFace local language models
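The sketch below illustrates the four compression techniques from the IBM list above on a toy PyTorch linear layer. It is a conceptual demo only; real pipelines rely on dedicated tooling (e.g. torch.ao.quantization, llama.cpp, optimum).

```python
# Toy illustration of pruning, quantization, low-rank factorization
# and knowledge distillation on a single linear layer.
import torch
import torch.nn.functional as F

layer = torch.nn.Linear(512, 512)
W = layer.weight.data

# 1) Pruning: zero out the 50% of weights with the smallest magnitudes.
threshold = W.abs().flatten().kthvalue(W.numel() // 2).values
W_pruned = torch.where(W.abs() > threshold, W, torch.zeros_like(W))

# 2) Quantization: map float32 weights to int8 with a per-tensor scale.
scale = W.abs().max() / 127
W_int8 = torch.clamp((W / scale).round(), -128, 127).to(torch.int8)
W_dequant = W_int8.float() * scale  # dequantized for compute

# 3) Low-rank factorization: approximate W with a rank-r product A @ B.
U, S, Vh = torch.linalg.svd(W)
r = 64
A = U[:, :r] * S[:r]  # (512, r)
B = Vh[:r, :]         # (r, 512)
# A @ B stores 2*512*64 parameters instead of 512*512 for W.

# 4) Knowledge distillation: train the student on the teacher's
#    temperature-softened output distribution (Hinton-style KD loss).
def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    return F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
```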
Leaderboards
- Open LLM Leaderboard - a Hugging Face Space by open-llm-leaderboard - Filtered by size to focus on SLMs
- Best Small Language Models (SLMs): Choose From the List of Top OpenSource SLMs With less than 3B params
Models
- Meta-llama/Llama-3.2-3B-Instruct · Hugging Face
- SmolLM2 - a HuggingFaceTB Collection
- Qwen/Qwen2.5-3B-Instruct · Hugging Face
- Qwen/Qwen2.5-Coder-14B-Instruct · Hugging Face
- LGAI-EXAONE/EXAONE-3.5-2.4B-Instruct · Hugging Face
- microsoft/Phi-3.5-mini-instruct · Hugging Face
- tiiuae/Falcon3-3B-Instruct · Hugging Face
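All of the models above publish standard Hugging Face checkpoints, so any of them can be run locally with a few lines of transformers code. A minimal sketch (assumes transformers and accelerate are installed and there is enough memory for a ~3B model):

```python
# Minimal local inference with a ~3B instruct SLM via Hugging Face transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen2.5-3B-Instruct"  # any instruct model from the list above
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # halves memory vs. float32
    device_map="auto",           # places weights on GPU if available (needs accelerate)
)

messages = [{"role": "user", "content": "Explain what a small language model is."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=200)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```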
Quantization
- Maxime Labonne - Quantize Llama models with GGUF and llama.cpp
- LLM Quantization | GPTQ | QAT | AWQ | GGUF | GGML | PTQ
- Quantization of LLMs with llama.cpp | by Ingrid Stevens | Medium
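The links above cover the GGUF/llama.cpp route. As a quick in-Python alternative, transformers can apply 4-bit post-training quantization at load time via bitsandbytes (assumes bitsandbytes is installed and a CUDA GPU is available):

```python
# On-the-fly 4-bit (NF4) post-training quantization at load time.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # NormalFloat4 weight format
    bnb_4bit_compute_dtype=torch.bfloat16,  # dtype used for dequantized compute
)

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-3B-Instruct",  # any model from the list above
    quantization_config=bnb_config,
    device_map="auto",
)
# The 4-bit model needs roughly a quarter of the fp16 memory footprint.
```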
Small VLMs
See VLMs
- SmolVLM - small yet mighty Vision Language Model
- A new family of 2B-parameter small vision language models that can be used commercially and deployed to smaller local setups, with completely open training pipelines (see the inference sketch below)
- smollm/finetuning/Smol_VLM_FT.ipynb at main · huggingface/smollm · GitHub
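A minimal inference sketch for SmolVLM, following the pattern from the Hugging Face release post (check the model card for the current API; the image path is a placeholder):

```python
# Minimal image-question answering with SmolVLM via transformers.
import torch
from PIL import Image
from transformers import AutoProcessor, AutoModelForVision2Seq

model_id = "HuggingFaceTB/SmolVLM-Instruct"
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForVision2Seq.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

image = Image.open("photo.jpg")  # placeholder: any local image
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "Describe this image."},
        ],
    }
]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=prompt, images=[image], return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=128)
print(processor.batch_decode(outputs, skip_special_tokens=True)[0])
```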
References
- #PAPER #REVIEW Small Language Models: Survey, Measurements, and Insights (2024)
- #PAPER #REVIEW A Survey of Small Language Models (2024)
- #PAPER #REVIEW A Comprehensive Survey of Small Language Models in the Era of LLMs: Techniques, Enhancements, Applications, Collaboration with LLMs, and Trustworthiness (2024)
- #PAPER MiniLLM: Knowledge Distillation of Large Language Models (2024)